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ABSTRACT  .  A  Monte  Carlo  algorithm  that  searches  for  the  optimal  docking 
configuration  of  hen  egg  white  lysozyme  to  an  antibody  is  developed.  'Both 
the  lysozyme  and  the  antibody  are  kept  rigid.  Unlike  the  work  of  other 
authors,  our  algorithm  does  not  atteiiq)t  to  expllcity  maximize  surface  contact, 
but  minimizes  the  energy  computed  using  coarse-grained  pair  potentials.  The 
final  refinement  of  our  best  solutions  using  all-atom  OPLS  potentials  consistently 
yields  the  native  conformation  as  the  preferred  solution  for  three  different 
antibodies.  We  find  that  the  use  of  an  exponential  distance-dependent  dielectric 
function  is  an  improvement  over  the  more  commonly  used  linear  form. 

Further  work  has  been  done  on  predicting  the  affinity  of  various  Avian  lysozymes 
for  a  couple  of  antibodies. 
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CXir  work  over  the  past  two  years  has  concentrated  on  the  study  of  five 
lysozyme-antibody  crystals.  The  coordinates  of  these  crystals,  and  vasts  amount 
of  data  on  the  affinity  of  lysozymes  of  various  birds  for  the  same  antibodies,  have 
allowed  us  to  study  the  interaction  of  the  two  proteins  involved  in  great  de^. 

In  particular  we  have  asked  the  following  two  questions:  (1)  can  we  develop  an 
doddng  algorithm  that  predicts  the  native  conformation  and  (2)  can  we  predict 
the  affinity  of  mutant  lysoz3nnes  for  an  antibody. 

To  address  the  first  question  we  developed  an  algoridim  that  first  generated 
about  10,000  docked  conformations  by  attaching  the  Ijrsozyme  to  a  hinge  point 
located  at  the  center  of  the  Complementary  Determining  Region  of  the  antibody. 
Using  a  Monte  Carlo  approach  these  conformations  were  successively  relaxed 
using  three  energy  functions  of  increasing  complexity.  The  first  energy  function 
was  based  on  an  approximate  Van  der  Waals  interaction.  The  second  was 
derived  from  a  statistical  analysis  of  known  aystals,  and  the  third  was  the  OPLS 
potential  developed  by  Jorgensen.  We  found  that  our  algorithm  consistently 
predicted  a  lysozyme  c^ormation  that  was  within  one  angstrom  rms  of  the 
crystal  structure.  These  results  were  published  in  the  April  1993  issue  of 
Proteins,  (copy  attached). 


In  the  past  year  we  have  concentrated  on  the  second  question,  attempting  to 
predict  the  affinity  of  various  avian  lysozymes  for  the  same  antibody  and  for  a 
second  antibody  which  binds  to  the  same  epitop>e  but  has  a  quite  different 
combining  site  loop  arrangement.  To  answer  tl^  question  we  refined  our 
docking  algorithm  and  alw  used  XPLOR  to  run  molecular  dynamics  on  die 
complexes  we  generated.  We  found  that  we  could  correctly  predict  the  relative 
binding  affinity  of  the  different  avian  lysozymes  to  the  antib^es  we  tested. 

For  the  cases  in  which  the  binding  affinity  was  low  we  could  also  explain  which 
particulzu:  interaction  caused  the  changes. 

This  work  has  been  written  up  in  a  joint  paper  with  our  collaborators  at  bistitut 
Pasteur. 
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Computer  Simulation  of  Antibody  Binding  Specificity 

MattM  PeUegrini*  and  Scbaatian  Doniach* 

Oepoitmenis  of ‘Physics  and  *  Applied  Physics,  Stanford  Universify,  Stan/brd,  California  94305-4090 


AJBSTRACT  A  Monte  Carlo  algorithm  that 
•earches  for  the  optimal  docking  configuration 
of  hen  egg  white  lyaoaynM  to  an  antibody  ia  de¬ 
veloped.  Both  the  lysoqrme  and  the  antibody 
are  kept  rigid.  Unlike  the  work  of  other  an- 
thora,  our  algorithm  doea  not  attempt  to  explic¬ 
itly  maximiie  aurface  contact,  but  minimiaea 
the  enmrgy  computed  uaing  coarae-grained  pair 
potentiala.  The  final  refinement  of  our  beat  ao- 
lutiona  uaing  aU-atom  OPLS  potentiala  (Jor- 
genaen  and  Tirado-Rivea*)  conaiatently  yielda 
the  native  conformation  aa  the  preferr^  aolu- 
tion  for  three  different  antibodiea.  We  find  that 
the  uae  of  an  exponential  diatance-dependent 
dielectric  Ainction  ia  an  improvement  over  the 
more  commonly  uaed  linear  form. 

O  19M  Witey-tlM.  IBC. 

Keyworda:  antigen-antibody  recognition, 
docking  algorithm,  diatance-de¬ 
pendent  dielectric 

INTRODUCTION 

Understanding  the  molecular  basis  for  specificity 
of  receptor-substrate  binding  in  general  and  im¬ 
mune  specificity  in  particular  is  a  problem  of  cur¬ 
rent  interest  in  molecular  biology.  Over  the  last  few 
years  the  structures  of  crystals  of  three  complexes  of 
antibodies  bound  to  different  epitopes  on  the  surface 
of  hen  egg  white  lysozyme  (HEL)  have  been  solved 
to  high  resolution.''^  These  provide  a  beautiful  test 
case  for  our  ability  to  model  antibody-protein  bind¬ 
ing  specificity,  since  there  is  evidence  that  the  con¬ 
formational  changes  which  take  place  upon  docking 
of  the  lysozyme  to  an  antibody  are  sn^l  (on  the 
scale  of  In  this  paper,  we  report  on  a  new 

approach  to  modeling  tite  specificity  of  antibody- 
lysozyme  binding  baaed  on  a  rigid  body  docking 
strategy. 

A  number  of  authors  have  recently  reported  com¬ 
puter  studies  of  docking.*''^  In  general,  their  algo¬ 
rithms  search  six-dimensional  phase  space  (3  trans¬ 
lations  and  3  rotations)  for  the  confimnation  that 
mazimizea  the  contact  area  between  the  proteins. 
However,  in  all  caaes  this  leads  to  a  large  number  of 
equally  good  candidates,  among  which  is  one  similar 
to  the  native  conformation  found  in  the  crystal.  The 
inoblem  of  distinguishing  the  beat  one  among  these 
solutions  was  addressed  by  Cherfils  et  al.*  and  by 
Shoichet  and  Kuntz.”  Both  groups  uaed  interatomic 


potentials,  including  electrostatic  terms,  to  refine 
the  best  solutions.  Shoichet  and  Kuntz*  also  cmn- 
puted  the  buried  surface  area  and  the  solvation  firee 
energy.  Althou^  the  native  solution  was  among  the 
very  best,  neither  group  could  disqualify  all  incor¬ 
rect  solutions.  A  few  conformations,  for  from  the  na¬ 
tive  one,  had  energies  very  cloae  to  or  even  lower 
than  the  native  one. 

We  have  addressed  this  problem  and  have  devel¬ 
oped  a  strategy  that  consistently  filters  out  non-na¬ 
tive  solutions  and  selects  the  native  soluti<m  for  all 
three  antibody-lysozyme  complexes.  The  algorithm 
we  constructed  to  generate  docking  conformations 
uses  successively  more  sophisticated  forms  of  inter- 
molecular  potentiala  inst^  of  explicitly  trying  to 
maximize  surface  contact  The  best  solutions  gener¬ 
ated  in  this  manner  are  then  refined  uaing  the  OPLS 
potentials,  created  by  Jorgensen  and  Tirado-Rives.* 
We  find  that  the  binding  energies  of  the  conforma¬ 
tions  generated  at  this  stage  of  the  search  are  very 
sensitive  to  the  method  uaed  to  simulate  dielectric 
screening  of  the  Coulomb  potential.  Tlw  most  suc¬ 
cessful  strategy  we  found  was  to  introduce  a  dis¬ 
tance-dependent  dielectric  constant  leading  to  an 
exponentially  screened  Coulomb  potential  with  a 
characteristic  len^  of  3  A,  and  a  switching  func¬ 
tion  from  9  to  10  A.  This  potential,  when  applied  to 
optimize  the  binding  of  our  top  ten  solutions,  yielded 
as  the  top  answer  a  conformation  within  1  A  rms  of 
the  native.  A  gap  corresponding  to  rou^y  20%  of 
the  top  binding  energy  appeared  between  the  best 
and  second  best  solutions. 

It  is  important  to  stress  that  our  docking  algo¬ 
rithm  (like  thoee  of  other  authors)  does  not  conq>ute 
the  binding  affinity  of  the  antigen  to  the  antibody. 
Such  a  calculation  would  have  to  include  effects  due 
to  changes  in  hydration  and  changes  in  entropy 
upon  binding.'*’'*  Our  selecti<m  criteria  are  bas^ 
o^y  on  comparison  of  binding  energies  whidi  in- 
clu^  electrostatic  and  van  der  Waals  effiscts.  One  of 
our  conclusions  is  that  the  electrostatic  component 
of  protein-protein  interactums  plays  a  significant 
role  in  determining  immune  qw^city. 

In  the  following  sections,  we  give  details  of  our 
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TABLE  1.  Example  of  Hinge  Potato  for  tiia  Three  Complezee* 


Complex 

Atom 

type 

Residue 

type 

Residue 

number 

Heavy  or 
light  chain 

Distance 
to  (A) 

hinge  point 

HYIO 

CD2 

TYR 

33 

H 

4.8 

HEl 

TRP 

98 

H 

2.1 

HH 

TYR 

50 

L 

4.8 

HY5 

CH2 

TRP 

90 

H 

3.3 

HEl 

TRP 

33 

L 

4.7 

OEl 

GLU 

50 

L 

2.8 

D1.3 

CA 

ASP 

100 

H 

4.3 

OD2 

ASP 

100 

H 

1.6 

CEl 

TYR 

101 

H 

4.6 

*The  hinge  poinu  selected  for  three  of  our  run*  are  given  in  tenn*  of  their  distance*  from  three 
reference  atoms.  A*  mentioned  above,  hinge  points  within  ^  2.5  A  of  the  above  ones  will  also  allow 
the  algorithm  to  function  well. 


searching  strategy.  In  particular  we  will  discuss  the 
two  novel  components  of  our  algorithm:  (1)  use  of  a 
set  of  intermediate  binding  energy  criteria  to  reject 
unfavorable  search  paths,  and  (2)  use  of  a  phenom¬ 
enologically  screened  Coulomb  potential  which 
smooths  out  barriers  between  local  minima  at  the 
final  stage  of  the  search.  We  will  also  discuss 
whether  our  novel  use  of  an  exponentially  screened 
dielectric  function  is  generalizable  to  other  prob¬ 
lems,  or  whether  it  is  a  computational  device  specific 
to  our  situation. 

METHODS 
Docking  Algorithm 

Our  docking  algorithm  is  conceptually  very  sim¬ 
ple.  It  can  be  viewed  as  a  series  of  three  filters,  each 
of  which  selects  only  10%  of  the  conformations  fed 
into  it;  the  final  solutions  are  then  refined  using 
all-atom  OPLS  potentials.  We  will  first  describe  the 
method  used  to  obtain  the  initial  conformations,  and 
then  describe  each  filter  in  detail. 

Selection  of  Inltinl  Conformations 

As  discussed  by  Tramontano  et  al.^  the  compli- 
mentarity  determining  region  (CDR)  of  an  antib^y 
is  formed  by  six  loops:  LI,  L2  and  L3  are  part  of  the 
variable  domain  of  the  ligdit  chain,  and  HI,  H2  and 
H3  are  part  of  the  variable  domain  of  the  heavy 
chain.  We  select  a  region  approximately  5  A  wide 
between  the  H3  and  L3  loops.  We  allow  this  region 
to  span  an  area  between  2  and  7  A  firom  the  surface 
of  the  antibody.  From  within  this  region  we  select  a 
hinge  point.  For  examples  of  hinge  points  used  in 
our  simulations,  see  Table  I.  Below  we  will  discuss 
the  dependence  of  the  algorithm’s  performance  on 
the  location  of  the  hinge  point  within  this  region. 
Note  that  the  choice  of  hinge  point,  while  localizing 
the  binding  site  to  be  in  the  vicinity  of  the  CDR,  does 
not  require  any  knowledge  of  the  antigen’s  native 


docked  conformation,  and  so  it  does  not  bias  the  sim¬ 
ulation  towards  such  a  conformation. 

We  then  attach  each  of  a  selected  set  of  “fiducial” 
atoms  (to  be  defined  below)  in  the  lysozyme  molecule 
to  this  point,  and  rotate  the  lysozyme  so  that  its 
center  of  mass  is  as  far  from  the  antibody  as  possi¬ 
ble.  For  each  fiducial  atom  we  then  rotate  the 
lysozyme  in  30  degree  steps  around  the  axis  formed 
by  the  initial  point  and  the  center  of  mass  of  the 
antibody  fragment.  To  make  sure  that  we  are  not 
biasing  our  initial  configurations  we  rotate  the 
lysozyme  by  an  arbitrary  fixed  angle  around  the 
same  axis  before  8q>plying  the  30  degree  rotations. 
We  ran  the  algorithm  for  various  values,  between  0 
and  30  degrees,  of  this  angle.  This  yields  approxi¬ 
mately  10,000  initial  conformations. 

The  selection  of  our  initial  conformation  by  this 
method  has  two  important  benefits.  Because  our 
hinge  point  is  located  at  the  ai^roximate  center  of 
the  antibody’s  antigen  binding  site,  and  only  a  few 
angstroms  ^m  the  surface,  we  are  sure  that  it  must 
lie  within  the  lysozyme  when  it  is  in  its  native  dock¬ 
ing  conformation.  Furthermore,  since  by  the  above 
procedure  we  create  10,000  initial  conformations 
aligned  with  the  center  of  mass  away  from  the  sur¬ 
face,  we  are  sure  that  at  least  a  few  are  close  (within 
10  A  rms)  to  the  native  conformation. 

Fiducial  Atoms 

The  set  of  fiducial  atoms  are  selected  from  a  list 
that  attempts  to  choose  inedominantly  atoms  in¬ 
volved  in  surface  interactions,  such  as  those  that 
form  strong  dipoles  or  are  at  the  extreme  end  of  a 
side  chain.  In  all,  this  includes  58  atom  types.  We 
include  the  main  chain  oxjrgen  and  nitrogen,  as  well 
as  the  ox^ns,  nitrogens,  and  hydrogens  from  polar 
side  chains.  In  nonpolar  residues  we  typically  select 
the  carbon  most  dikant  from  the  alp^  carbon.  For 
a  complete  listing  of  fidtttial  atonu,  see  Table  II.  The 
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TABLE  n.  List  of  “Fkluciia”  Atoms  Used  in 
Coarse-Grained  Pair  Potentials* 


Residue 

type 

Side  chain  “fiducial  atoms” 

ALA 

CB 

ARG 

NHl,  HHll,  HH12.  NH2,  HH21,  HH22.  NE 

ASN 

ODl.  ND2,  HD21.  HD22 

ASP 

ODl,  OD2 

CYS 

SG 

GLN 

OEl,  NE2,  HE21.  HE22 

GLU 

OEl,  OE2 

ms 

NDl,  NE2 

ILE 

CDl,  CGI 

LEU 

CDl,  CD2 

LYS 

CD,  NZ,  HZl,  HZ2,  HZ3 

MET 

SD,  CE 

PHE 

CDl.  CD2,  CZ 

PRO 

CG 

SER 

OG,  HG 

THR 

OGl,  HGl,  CG2 

TRP 

CZ2,  CZ3.  NEl,  HEl,  CB 

TYR 

OH,  CDl,  CD2,  HH 

VAL 

CGI,  CG2 

*Li8t  of  side  chain  “fiducial”  atoms  used  in  our  coarse-grained 
potentials.  The  main  chain  “fiducial"  atoms  include  oxygen  and 
nitrogen  for  all  residues,  and  the  alpha  carbon  for  glycine. 


“flduciar  atoms  account  for  approximately  half  of 
the  total  atoms  of  lysozyme  and  the  antibodies. 

The  “essential”  hydrogens,  those  that  are  charged 
and  participate  in  hydrogen  bonds,  are  added  to  our 
Brookhaven  Protein  Data  Bank  (POB)  files  using 
standard  bond  angles  and  lengths  by  the  program 
SYBYL.*®  The  hydrogens  that  are  covalently 
bonded  to  carbons  are  not  explicitly  included,  and 
the  carbon  is  treated  as  an  “extended”  atom,  follow¬ 
ing  the  convention  used  in  CHARMM.'^ 

Filters 

We  allow  our  initial  conformations  to  relax  using 
a  Monte  Carlo  Metropolis  algorithm®  with  a  very 
simplified  interatomic  potential  (only  between  fidu¬ 
cial  atoms)  designed  to  avoid  steric  clashes  and  max¬ 
imize  contact  area.  This  potential  is  equally  repul¬ 
sive  for  all  interatomic  distances  less  than  2  A,  and 
attractive  for  distances  less  than  five,  and  zero 
thereafter.  The  magnitude  of  the  repulsive  compo¬ 
nent  is  3.0  Kcal/mole  and  -0.2  for  the  attractive 
one. 

In  our  first  stage  we  allow  the  10,(X)0  initial  con¬ 
formations  to  minimize  for  50  Monte  Carlo  moves 
selected  using  the  Metropolis  algorithm  from  a 
range  of  0.5  A  in  translation  and  2  degrees  in  rota¬ 
tion.  We  have  found  that  one  obtains  more  accurate 
results  if  each  conformation  is  allowed  to  relax 
twice,  starting  from  the  same  initial  state  but  using 
different  random  number  seeds,  for  50  time  steps, 
instead  of  once  for  100  time  steps.  Therefore,  ini¬ 
tially  we  perform  20,000  relaxations  of  fifty  time 
steps.  The  temperature  is  kept  fixed  throughout  the 


run,  and  set  to  a  value  that  approximates  room  tem¬ 
perature  (kT  =  0.6  Kcal/mole). 

In  the  next  stage  we  select  the  top  10%  of  these 
solutions,  and  minimize  for  another  50  time  steps 
using  coarse-grained  statistical  potentials.  These 
are  generalizations  of  potentials  deBned  by  Wilson 
and  Doniach^®  that  include  side  chain  information 
(see  also  SippP®).  The  potentials  are  referred  to  as 
“statistical"  because  they  are  derived  from  an  anal¬ 
ysis  of  pair  correlations  between  “fiducial”  atoms  in 
the  Brookhaven  Protein  Data  Bank.  Since  PDB  files 
do  not  include  hydrogen  atoms,  our  correlations  are 
measured  between  all  possible  pairs  of  nonhydrogen 
atoms  in  our  list  of  “fiducial”  atoms.  This  leads  to 
800  potentials  in  all.  The  interactions  between  hy¬ 
drogen  and  other  atoms  are  not  included  in  this 
stage,  but  are  included  in  the  next  filter.  The  corre¬ 
lations  are  used  to  generate  effective  potentials  by 
taking  the  natural  logarithm  of  the  resulting  distri¬ 
bution  functions  and  normalizing  to  zero  at  10  A 
interatomic  distance.  The  total  interaction  energy 
for  a  given  configuration  is  then  set  equal  to  the  sum 
of  potentials  for  all  applicable  interprotein  fiducial 
atom  pairs.  Since  these  potentials  are  constructed 
from  empirical  distribution  functions  we  believe 
that  they  incorporate,  in  an  approximate  way,  ef¬ 
fects  due  to  electrostatic  and  hydrophobic  interac¬ 
tions. 

The  energies  of  the  top  10%  of  these  solutions  are 
then  further  minimized  using  the  previous  statisti¬ 
cal  potentials  to  which  have  been  added  hydrogen 
bonding  terms  from  the  OPLS  potentials  (in  the 
OPLS  potentials  the  hydrogen  bonds  are  not  given 
special  treatment;  they  are  represented  by  a  Len- 
nard-Jones  6,  12  potential).  We  weigh  the  value  of 
the  hydrogen  bon^  very  heavily  with  respect  to  the 
statistical  interactions:  each  hydrogen  bond  is  arbi¬ 
trarily  multiplied  by  a  factor  of  ten,  while  the  sta¬ 
tistical  pair  potentials  are  kept  equal  to  their  value 
describe  above.  This  allows  this  filter  to  primarily 
select  conformations  that  are  favorable  in  terms  of 
hydrogen  bonding. 

Up  to  this  point,  all  the  pair  potentials  had  been 
computed  between  “fiducial”  atoms  only.  In  the  final 
stage  we  apply  a  full-atom  description  of  the  anti¬ 
gen-antibody  complex.  The  final  top  ten  solutions 
obtained  by  the  above  procedure  were  minimized  for 
1,000  MC  steps  using  the  full  atom  OPLS  potentials. 
We  paid  particular  attention  to  the  meth^  used  for 
dielectric  screening  of  the  long  range  Coulomb  term, 
as  discussed  below.  Executing  the  entire  algorithm 
requires  approximately  30  hours  of  cpu  time  on  a 
DECstation  5000/200. 

Dielectric  Screening  of  the  Coolomb  Potential 

The  nature  of  dielectric  screening  in  proteins  and 
the  effects  of  the  solvating  water  have  been  dis¬ 
cussed  by  several  authors.**  *®’**’*®  In  general  it  will 
consist  of  a  term  due  to  the  reorientation  of  dipoles 
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both  in  the  water  and  in  the  protein,  an  electronic 
polarizability  term  and  a  term  resulting  from  ionic 
displacements  in  the  water.  In  principle,  if  the  elec¬ 
tronic  term  can  be  taken  care  of  by  a  homogeneous 
uniform  dielectric  constant,  the  effects  of  dipole  re¬ 
orientation  and  ionic  displacements  will  automati¬ 
cally  be  taken  care  of  if  a  full  molecular  dynamic 
simulation  is  performed,  with  resulting  dipolar  re¬ 
laxation.  However,  this  is  prohibitive  when  a  large 
number  of  docking  configurations  need  to  be  exam¬ 
ined.  Therefore,  a  number  of  workers  in  the  field 
have  proposed  distance-dependent  dielectric  func¬ 
tions  which  might  be  used  as  an  ad  hoc  method  to 
simulate  the  polarizability  of  proteins  including  the 
effects  of  the  solvent,  thus  allowing  rigid  body  esti¬ 
mates  of  Coulomb  interactions  between  proteins  or 
between  different  parts  of  a  given  protein  (reviewed 
in  reference  24).  Distance-dependent  dielectric  func¬ 
tions  which  have  been  proposed  include  a  linear 
form  c(r)  =  r  and  more  complex  forma,  e.g.,  a  form 
including  an  exponential  term  due  to  Mehler*'*  (see 
also  Warshel^^).  Other  authors  have  developed  ad¬ 
ditional  pairwise  energy  terms  that  explicitly  at¬ 
tempt  to  account  for  charge-solvent  interactions;^ 
however,  for  simplicity  we  have  not  yet  included 
such  terms  in  our  simulation. 

In  addition  to  the  problem  of  simulating  the  po¬ 
larizability  of  proteins  in  solution,  there  is  an  addi¬ 
tional  computational  problem  resulting  from  the 
long  range  of  the  Coulomb  potential,  even  when 
screened.  This  has  been  addressed  by  Brooks  et  al.  in 
CHARMM^^  by  introducing  a  switching  function  to 
cut  off  the  potential.  Alternately,  one  can  group  at¬ 
oms  to  form  neutral  subunits  which  then  interact 
through  dipoles  and  higher  order  poles  which  decay 
faster  than  1/r.  This  technique  is  also  addressed  in 
CHARMM,  althoui^  in  this  work  we  have  used  ex¬ 
clusively  the  simpler  switching  function  approach. 

We  have  found  that  the  screening  and  cutoff  meth¬ 
ods  used  are  critical  to  the  success  of  the  algorithm. 
For  instance,  we  were  unable  to  show  that  the  na¬ 
tive  solution  is  best  when  we  used  a  constant  dielec¬ 
tric,  regardless  of  its  value,  and  discontinuously  cut 
off  Coulomb  potential  at  10  A. 

In  our  simulations  we  tried  four  methods,  three 
from  CHARMM  and  one  developed  independently. 
The  methods  used  by  CHARMM  are: 

(a)  Constant  dielectric 
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We  have  found  a  fourth  useful  approximation: 
(d)  Exponentially  decaying  dielectric  screening: 


=  2  5! (4) 

1  J  fy 

where  is  measured  in  A.  In  all  these  cases  the 
switching  function  is  given  by 


=  1  when  s  r„,  (5) 


when  rgn<r^s  (6) 

=  0  when  >r„.  (7) 

In  Figure  1  we  plot  r^V(r)  as  a  function  of  r  for  the 
four  different  pcdentials.  It  can  be  seen  that  the  ma¬ 
jor  differences  between  exponential,  linear  and 
Mehler  dielectric  functions  (to  be  defined  later)  oc¬ 
cur  after  6  A  adiile  the  shifted  potential  has  a  com¬ 
pletely  different  form. 


(b)  Distance-dependent  dielectric  (linear) 
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RESULTS 

Selection  of  Preferred  Docking  Comidez 

We  ran  the  three  lysopyme-antibody  complexes 
throu^  the  first  three  filters  of  our  alg^thm  using 
the  hinge  points  listed  in  Table  I  and  nearby  hinge 
points  (within  3  A  rms).  From  these  stages  we  ob- 


(c)  Shifted  dielectric 
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TABLE  nL  Blndinf  EawgiM  Found  fat  Suqile  Buas  That  Uaad  an 
_ EapoaentiaUy  IncraaiiaR  Diatoctrte  Functfam* 


Complex 

Lowest 

energy 

rms 

distance 

Second 

lowest 

energy 

rms 

distance 

Fab  HyHELS-lysozyme 

-110 

0.8 

-77 

14.0 

Fab  HyHELlO-lysozyme 

-68 

0.6 

-49 

17.2 

Fab  DI. 3-lysozyme 

-73 

0.3 

-59 

17.0 

‘‘Binding  energie*  calculatad  for  the  OPLS  potentiaU  with  an  exponentially  acreenad  Coulomb 
potential  and  a  switching  function  at  9  A.  The  rms  values  are  calculated  only  for  atoms  within  15 
A  of  the  antibody.  This  is  used  to  avoid  misleadingly  large  rms  values  that  might  be  caused  by  the 
epitope  acting  as  a  pivot  around  which  the  lyscayme  can  rotate  by  small  angles.  The  units  of  the 
OPLS  potentials  are  Kcal/mole. 


TABLE  IV.  Bfaidfaig  Energies  Found  in  Sample  Runs  That  Used  a  Linear 

Dielectric  Function* 


Complex 

Lowest 

energy 

rms 

distance 

Second 

lowest 

energy 

rms 

distance 

Fab  HyHELS-lysozyme 

-107 

0.9 

-67 

14.3 

Fab  HyHELlO-lysozyme 

-56 

16.8 

-39 

1.8 

-64.7 

0.9 

-39 

13.4 

-63.2 

0.9 

-35 

18.2 

Fab  Dl. 3-lysozyme 

-70 

0.5 

-55 

16.7 

‘Binding  energies  for  the  OPLS  potentials  with  a  linear  dielectric  that  has  a  switching  function 
between  7  and  8  A.  In  the  case  of  HyHELlO,  the  OPLS  potential  with  this  dielectric  function, 
starting  from  different  but  nearby  hinge  points  (within  2  A),  was  not  able  to  consistently  fiitd  a 
solution  within  1  A  of  the  crystal. 


TABLE  V.  Binding  Energies  Found  in  Sample  Rune  That  Used  a  Shilled 

Dielectric  Function* 


Complex 

Lowest 

energy 

rms 

distance 

Second 

lowest 

energy 

rms 

distance 

Fab  HyHELS-lysozyme 

-42 

0.9 

-25 

13.8 

Fab  HyHELlO-lysozyme 

-20 

0.7 

-20 

14.0 

Fab  Dl.S-lysozyme 

-61 

0.7 

-55 

12.8 

‘Binding  energies  for  the  OPLS  potentials  with  a  diifted  dielectric  function  that  has  a  cutoff  at  7 
A.  In  the  case  of  HyHELlO,  use  of  this  dielectric  fuMtion  was  not  able  to  find  an  energy  gap 
between  the  native  solution  and  the  next  best  solution  with  large  rms  deviation. 


tained  a  list  of  the  top  ten  docked  conformations  for 
each  complex.  These  were  then  refined  using  the 
all-atom  OPLS  potentials  with  the  four  different 
phenomenologic^  dielectric  functions. 

The  results  for  the  simulations  are  summarized  in 
Tables  m-V,  for  the  various  dielectric  functions.  For 
the  case  of  a  constant  dielectric  function  we  set  the 
contributions  from  atoms  separated  by  a  distance 
greater  than  10  A  to  zero,  but  did  not  use  a  switch¬ 
ing  function.  We  did  not  include  a  table  of  results  for 
this  case  since  the  results  were  very  poor.  This  is  due 
to  the  fact  that  the  energy  contribution  from  atoms 
separated  by  10  A  is  an  order  of  magnitude  larger 
than  contributions  from  short  distances.  Thus,  the 
binding  energy  is  not  a  smooth  function  under  this 
method,  but  varies  wildly  from  one  Monte  Carlo 
move  to  the  next. 


To  achieve  a  well-defined  energy  surfiue  in  dock¬ 
ing  apace  while  limiting  the  number  of  atomic  pairs 
computed,  one  needs  to  cut  off  the  potentials 
smoothly.  In  Tables  III-V  we  report  some  sample 
luns  for  the  different  dielectric  frinctions.  When  we 
show  only  one  run,  this  implies  that  all  other  runs 
with  hinge  points  within  2  to  3  A  of  this  one  ob¬ 
tained  similar  results. 

As  seen  in  the  comparison  of  Tables  m-V,  the  best 
method  we  found  was  to  simulate  the  effects  of  the 
dielectric  polarizability  and  solvent  as  leading  to  an 
exponentfadly  deea)ring  potential.  Hie  other  two 
methods,  linear  and  sUfted  dielectric  functions, 
seemed  to  work  well  in  the  cases  of  the  HyHELS  and 
Dl.S-lysozyme  complexes,  but  did  not  consistently 
find  the  native  as  t^  top  solution  for  the  HyHELlO- 
lysozyme  complex.  For  the  case  at  HyHEL-10  and  a 
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linear  dielectric  we  report  three  runs  with  three  dif¬ 
ferent,  but  close  (within  2  A),  hinge  points.  As  can  be 
seen,  the  algorithm  failed  to  find  the  native  in  one  of 
these  runs,  although  when  it  did  find  a  native-like 
conformation  it  turned  out  to  be  the  one  with  lowest 
energy.  On  the  other  hand  the  shifted  dielectric 
function  consistently  found  a  native-like  solution 
but  also  found  solutions  that  were  far  from  the  na¬ 
tive  ( 14  A  rms)  but  had  equal  energy  (we  have  ver¬ 
ified  that  with  the  exponentially  increasing  dielec¬ 
tric  function  the  solution  closest  to  the  native  is 
indeed  lower  in  energy  than  these  solutions).  The 
exponentially  increasing  dielectric  function  instead 
found  the  native  binding  conformation  consistently 
for  all  three  antibodies,  within  a  1  A  rms  deviation. 
It  also  demonstrated  that  for  the  top  ten  solutions 
found  by  our  algorithm,  the  native  one  was  energet¬ 
ically  favorable  and  a  gap  of  at  least  20%  of  the 
native  binding  energy  appeared  between  the  top  two 
solutions. 

The  use  of  an  exponential  dielectric  function  also 
leads  to  solutions  that  were  closer,  in  terms  of  rms 
distance,  to  the  native.  For  instance,  the  top  solu¬ 
tions  for  the  D1.3  antibody  fotmd  using  the  exponen¬ 
tial  dielectric  function  was  within  0.3  A  rms  of  the 
native,  while  that  found  using  a  linear  dielectric 
function  was  within  0.5  A.  Thus,  screening  the  Cou¬ 
lomb  potential  with  an  exponential  dielectric  func¬ 
tion  is  more  effective  than  the  other  methods  both  in 
its  consistency,  as  seen  above,  and  in  its  ability  to 
select  conformations  that  are  closer  to  the  one  found 
in  the  crystal. 

As  stated  above,  we  ran  the  algorithm  several 
times  to  determine  the  dependence  of  the  results  on 
the  initial  hinge  point  and  the  initial  angle  of  rota¬ 
tion.  In  the  case  of  the  01.3  antibody-lysozyme  com¬ 
plex,  we  selected  three  hinge  points  that  varied  from 
2  to  7  A  from  the  surface,  along  the  axis  formed  by 
the  center  of  mass  of  the  antibody  and  the  first  hinge 
point.  In  all  cases  we  found  results  consistent  with 
those  stated  above  (i.e.,  a  conformation  within  1  A 
rms  of  the  native  being  at  least  20%  lower  in  energy 
than  all  other  conformations).  Similarly  we  allowed 
the  initial  angle  of  rotation  (this  is  the  rotation  we 
apply  to  all  our  conformations  before  applying  the 
12, 30  degree  rotations)  to  vary  from  0  to  30  degrees, 
in  5  degree  intervals,  in  the  above  simulations  and 
found  that  its  value  had  no  effect  on  the  results. 
Based  on  these  tests  we  feel  it  is  unlikely  that  we  are 
biasing  the  initial  conformations,  and  the  subse¬ 
quent  search,  towards  the  native  conformation. 

Hinge  points  selected  outside  the  range  described 
above  yielded  very  poor  results:  among  the  top  ten 
overall  solutions  none  were  close  to  the  native,  al¬ 
though  a  native-like  solution  was  among  the  next 
ten.  This  suggests  that  one  must  be  relatively  care¬ 
ful  in  the  selection  of  the  hinge  point,  but  with  the 
rather  large  range  of  successful  hinge  points  we 
found  (2  to  7  A  from  the  surface,  located  between  the 


100  -I - ^ 

f, 

i 


- expor.enttal 

—  ..near 


0 


-100  — 

I 

i 


I 

1 

I  I 

' 


-200 


-0 


4  6 

A 


10 


Fig.  2.  PM  o(  tlw  integrated  binding  energy  as  a  tunctlon  o) 
distance: 

vnhen  Y,(r)  is  me  potential  between  the  f'  atom  m  the 
lysozyme  and  ttie  /"  atom  in  the  antibody  that  are  separated  by 
a  distance  r.  The  integral  is  pertormed  lor  bom  the  exponential 
and  linear  dMadric  (unctions.  Note  that  the  two  begin  to 
diverge  only  alter  6  A,  and  than  by  orriy  5%. 


L3  and  H3  loops)  this  should  not  be  a  severe  limita¬ 
tion.  For  the  three  complexes  studied,  a  successful 
hinge  point  was  found  on  our  first  attempt. 

Comparison  of  Distance-Dependent 
Dielectric  Functions 

We  have  tried  to  understand  why  the  exponen¬ 
tially  increasing  dielectric  function  performs  better 
than  the  commonly  used  linear  dielectric  function. 
In  Figure  2  we  plot  the  integral  of  the  total  binding 
energy  as  a  function  of  the  distance  between  atomic 
pairs  that  have  been  included  in  the  calculation. 
Both  the  linear  and  exponential  dielectric  functions 
have  been  cut  off  with  a  switching  function.  We  note 
that  in  this  case  the  two  potentials  lead  to  similar 
energies,  and  as  expected,  the  small  divergence  be¬ 
tween  the  two  occurs  mainly  in  the  6  to  10  A  range. 
Similarly,  we  compared  the  binding  energies  of  the 
top  10  candidates  of  all  three  antibodies,  after  OPLS 
minimization,  using  both  the  linear  and  exponential 
dielectric  functions  and  in  all  cases  foimd  that  the 
energies  of  a  given  configuration  calculated  with  the 
two  dielectric  functions  did  not  differ  by  more  than 
10%.  Since  the  energies  of  a  given  configuration 
measured  by  the  two  dielectric  functions  are  not  sig¬ 
nificantly  different,  the  difference  in  performance 
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Bg.  3.  Plot  o(  the  accumulated  energy  integrated  with  respect 
to  distance  versus  distance  (calculated  without  using  a  switching 
function  to  cut  off  the  dielectric  function): 

“  allpatn 

SE^  Mr  'S  given  in  equation  (12).  The  energy  is  computed 
between  atoms  grouped  by  amino  iKid  units,  and  the  distance 
represents  the  length  in  Angstroms  between  the  centers  of 
mass  of  the  amino  acids. 


must  be  due  to  difTerences  in  the  algorithm’s  sam¬ 
pling  of  the  phase  space  when  using  the  two  func¬ 
tions. 

To  get  a  better  idea  of  this  effect,  we  plot  in  Figure 
3  the  contributions  to  the  total  energy  grouped  by 
amino  acid  units,  as  a  fimction  of  their  separation, 
when  no  switching  function  is  used: 

bFoopoiAr)  =  Vliioai  —  Raa2^ 

oaUi) 

Qa2{f) 

8(r  -  -  Road)  (8) 

where  aal(i),  aa2(j)  are  atoms  in  amino  acids  1  and 
2  and  Raatj2  ^  centers  of  mass  of  the  two  amino 
acids.  As  shown  in  the  figure,  the  linear  and  expo¬ 
nential  dielectric  functions  lead  to  virtually  indis¬ 
tinguishable  total  energies.  The  main  difference  ap¬ 
pears  to  be  that  the  exponential  form  tends  to 
smooth  out  variations  in  energy  versus  distance. 
This  has  the  effect  of  lowering  barriers  which  get  in 
the  way  of  energy  minimization  using  the  Monte 
Carlo  moves. 

We  have  verified  this  fact  by  studying  the  basins 
of  attraction  under  the  two  functions.  We  started  the 
lysozyme  in  the  native  configuration  of  the  D1.3- 
ffiL  complex.  We  then  moved  the  lysozyme  from 
this  configuration  by  pulling  it  away  from  the  sur¬ 


face,  in  steps  of  1  A.  After  100  steps  of  relaxation  at 
room  temperature,  the  simulations  conducted  with 
the  exponential  dielectric  function  settled  back  into 
the  native  basin  from  as  far  away  as  3  A,  while  those 
conducted  with  the  linear  dielectric  function  got 
stuck  in  shallow  local  minima  when  started  I  A 
away.  Although  after  a  lengthy  relaxation  the  sim¬ 
ulations  using  a  linear  dielectric  function  will  even¬ 
tually  settle  into  the  native  basin,  those  that  use  the 
exponential  dielectric  function  are  significantly 
more  efficient;  in  the  runs  above,  when  carried  out 
for  more  than  100  steps,  the  exponential  dielectric 
function  requires  two  to  three  times  fewer  moves  to 
settle  into  ^e  native  basin. 

Thus,  we  believe  that  the  improved  performance 
we  find  using  the  exponential  dielectric  function  re¬ 
sults  from  its  smoothing  effect  on  the  contributions 
to  the  interaction  energy  from  atoms  separated  by 
distances  greater  than  6  A,  and  the  subsequent  low¬ 
ering  of  barriers.  Because  the  linear  dielectric  func¬ 
tion  leads  to  noisy  contributions  due  to  the  accumu¬ 
lated  energy  of  interaction  at  long  distances,  we 
found  that  one  had  to  switch  it  offbetween  7  and  8  A 
to  obtain  the  best  results.  On  the  other  hand  the 
exponential  dielectric  function  performed  best  when 
switched  off  between  9  and  10  A. 

We  found  that  the  3  A  decay  length  for  the  expo¬ 
nentially  increasing  dielectric  used  in  our  simula¬ 
tions  was  fairly  critical.  When  extended  to  4  A,  the 
simulations  were  no  longer  successful,  and  similarly 
for  decay  lengths  below  2  A.  Note  that  the  exponen¬ 
tial  form  used  here  does  not  represent  a  Debye- 
Huckel  screened  potential  at  a  pcurticular  value  of 
ionic  strength.  Instead  our  dielectric  function  pro¬ 
vides  a  very  simplified  model  of  effects  due  to  elec¬ 
tronic  polarizability  and  dipolar  relaxation  within 
the  protein  and  in  the  solvent. 

DISCUSSION 

Our  algorithm  has  three  features  which  dilr  .-r 
from  those  of  other  authors  addressing  the  problem: 
(1)  the  use  of  a  hinge  point  near  the  center  of  the 
antigen-binding  site  that  is  used  to  determine  the 
location  of  the  initial  conformations;  (2)  the  use 
of  a  binding  energy  test  based  on  coarse-grained  pair 
potentials  in  the  selection  process  for  candidate 
docking  conformations;  and  (3)  the  introduction  of 
an  exponentially  increasing  dielectric  function  in 
rigid  body  protein-protein  interactions  to  simulate 
screening. 

As  discussed  above,  the  use  of  a  hinge  point  allows 
us  to  generate  in  an  unbiased  way  initial  conforma¬ 
tions  that  include  conformations  close  to  the  native. 
This  has  the  advantage  of  allowing  iu  to  cut  down 
the  length  of  searches:  we  relax  oonfigiuntions  for 
only  50  steps  in  the  first  three  stages.  We  are  also 
assured  that  our  method  will  exclude  all  solutions 
that  do  not  have  a  part  of  the  lysozyme  lying  within 
the  hinge  point. 
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We  believe  part  (2)  provides  an  improvement  over 
algorithms  that  select  conformations  by  maximizing 
surface  contact:  our  potentials  tend  to  eliminate  can¬ 
didates  which  may  be  favorable  in  terms  of  contact 
area  but  unfavorable  energetically.  However,  the  al¬ 
gorithms  of  Shoichet  and  Kuntz^  and  Cherflls  et  al.* 
find  solutions  which  are  both  favorable  in  terms  of 
contact  area  and  energetically.  Why  are  these  solu¬ 
tions  not  found  by  our  algorithm?  We  conjecture  that 
these  solutions  may  be  unfavorable  in  terms  of  a 
third  criterion:  size  of  basin  of  attraction.  We  as¬ 
sume  that  our  strategy  eliminates  candidates  with  a 
narrow  basin  of  attraction  at  an  early  stage  in  favor 
of  ones  with  large  basins  of  attraction.  This  is  due  to 
the  fact  that  we  allow  our  initial  configurations  to 
relax  using  a  coarse-grained  energy  criterion  for  a 
limited  search  (fifty  time  steps)  in  the  first  stage, 
thus  making  it  unlikely  for  the  lysozyme  to  settle  in 
a  narrow  basin.  On  the  other  hand,  a  search  based 
on  a  criterion  of  maximizing  surface  contact  may 
find  a  basin  which  is  narrow  on  energy  grounds  but 
may  appear  much  broader  based  on  a  contact  crite¬ 
rion.  llius,  we  are  lead  to  believe  that  the  basins  of 
attraction  that  we  find  when  using  our  coarse¬ 
grained  pair  potentials  more  accurately  reflect  the 
properties  of  the  real  system  than  do  the  basins  of 
attraction  generated  on  geometric  grounds. 

Our  use  of  an  exponentially  increasing  distance- 
dependent  dielectric  function,  part  (3),  also  appears 
to  be  an  improvement  over  the  standard  linear  di¬ 
electric,  at  least  in  the  case  of  these  three  complexes, 
in  its  ability  to  find  a  native-like  configuration  more 
consistently.  As  discussed  above,  this  improvement 
seems  not  to  be  due  to  its  ability  of  resolving  the 
binding  energy  of  the  native  relative  to  that  of  the 
false  solutions,  but  rather  due  to  the  lowering  of 
energy  barriers  between  local  minima.  Thus,  as  we 
have  shown,  the  lysozyme  is  able  to  sample  phase 
space  more  efficiently  than  would  be  possible  with 
the  standard  linear  dielectric.  At  this  stage  we  can¬ 
not  say  whether  this  feature  is  an  artifact  which  is 
computationally  useful,  or  a  reflection  of  the  topog¬ 
raphy  of  the  real  docking  landscape  found  in  the 
physical  system. 

We  note  that  the  use  of  a  phenomenological  dis¬ 
tance-dependent  dielectric  function  constitutes  a 
substantial  simplification  of  the  real  problem.  We  do 
not  explicitly  include  water  in  our  simulation,  and 
we  treat  the  protein  as  a  rigid  body.  Thus,  one 
should  be  wary  of  generalizing  the  use  of  distance- 
dependent  dielectric  functions  of  this  type  to  other 
problems.  However,  our  work  provides  an  additional 
piece  of  evidence  that  suggests  that  these  potentials 
are  an  effective  representation  of  the  antibody- 
lysoz}rme  interaction  in  the  case  of  our  three  com¬ 
plexes.  In  this  context  it  is  interesting  to  note  that 
our  distance-dependent  dielectric  function  is  similar 
to  one  used  by  Mehler  and  Solmajer  to  describe  pK 
shifts  within  a  protein.'^  Their  function  is: 


«(r)  =  A  +  Bl[\  +  Aexp(-\flr)].  (9» 

For  small  values  of  r  (r  <  10  A)  their  dielectric  also 
has  an  exponentially  increasing  behavior,  with  a 
length  constant  very  close  to  our  value  of  3  A.  The 
success  of  their  dielectric  function  in  describing  pK 
shifts  and  the  ability  to  obtain  similar  results  as 
simulations  that  include  water  explicitly, suggests 
that  our  function  might  have  wider  applicability 
than  that  examined  in  this  paper.  However,  we  have 
not  as  yet  attempted  to  apply  our  dielectric  function 
to  other  problems. 

Finally,  the  fact  that  the  success  of  the  algorithm 
is  very  sensitive  to  the  treatment  of  long  range  (6  to 
10  A)  interactions  leads  us  to  conclude  that  stenc 
“fit”,  while  probably  a  necessary  condition  for  high 
binding  affinity,  is  not  a  sufficient  criterion  for  se¬ 
lectivity.  Indeed,  even  the  identity  of  the  amino  ac¬ 
ids  forming  the  surface  epitope  may  not  provide  a 
complete  selectivity  criterion  and  mutation  of  resi¬ 
dues  buried  in  the  antigen  or  antibody  could  lead  to 
changes  in  binding  specificity. 

SUMMARY 

We  have  constructed  a  docking  algorithm  that 
successfully  finds  the  native  conformation  (within 
0.3  A  rms)  for  three  different  antibodies  to  lysozyme 
and  consistently  finds  it  to  be  of  substantially  lower 
energy  (20%)  than  any  other  docking  solution  gen¬ 
erated.  This  is  in  contrast  to  the  results  of  other 
docking  algorithms  published  to  date,  which  all 
found  non-native  conformations  that  were  energeti¬ 
cally  comparable  to,  or  even  more  tightly  bound 
than  solutions  near  the  native  one.  We  attribute  the 
success  of  our  algorithm  to  the  fact  that:  ( 1)  it  gen¬ 
erates  starting  conformations  that  in  an  unbiased 
way  allow  for  efficient  relaxation;  (2)  rather  than 
searching  for  solutions  on  the  geometric  grounds  of 
maximization  of  contact  area  between  antibody  and 
antigen,  it  screens  solutions  by  filtering  them 
through  an  energy  selection  criterion  based  on  three 
different  sets  of  coarse  grained  pair  potentials;  and 
(3)  it  uses  an  exponentially  increasing  dielectric 
function  that,  in  the  three  cases  examined,  allows 
our  algorithm  to  search  phase  space  and  locate  the 
native  conformation  more  consistently  than  the 
standard  linear  dielectric  function. 
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